A new method to measure the semantic similarity of GO terms
Abstract
MOTIVATION: Although controlled biochemical or biological vocabularies, such as Gene Ontology (GO) (http://www.geneontology.org), address the need for consistent descriptions of genes in different data sources, there is still no effective method to determine the functional similarities of genes based on gene annotation information from heterogeneous data sources.
RESULTS: To address this need, we proposed a novel method to encode a GO term's semantics (its biological meaning) into a numeric value by aggregating the semantic contributions of its ancestor terms (including the term itself) in the GO graph and, in turn, designed an algorithm to measure the semantic similarity of GO terms. Based on the semantic similarities of the GO terms used for gene annotation, we designed a new algorithm to measure the functional similarity of genes. The functional similarities that our algorithm assigns to genes in pathways retrieved from the Saccharomyces Genome Database (SGD), and the clusters obtained from these similarity values, are shown to be consistent with human perspectives. Furthermore, we developed a set of online tools for gene similarity measurement and knowledge discovery.
AVAILABILITY: The online tools are available at http://bioinformatics.clemson.edu/G-SESAME.
SUPPLEMENTARY INFORMATION: http://bioinformatics.clemson.edu/Publication/Supplement/gsp.htm.
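The abstract does not spell out the formulas, so the following is only a minimal sketch of the idea it describes: each ancestor of a term receives a contribution that decays along the edges leading up from that term, the contributions are summed into a single semantic value, and two terms are compared through the contributions of their shared ancestors. The toy graph, the edge weights (0.8 for `is_a`, 0.6 for `part_of`), the best-match averaging used for genes, and all function names are assumptions made for illustration, not details taken from the text above.

```python
# Hypothetical toy GO-like DAG: child term -> list of (parent term, relation).
PARENTS = {
    "C": [("A", "is_a"), ("B", "part_of")],
    "A": [("R", "is_a")],
    "B": [("R", "is_a")],
    "R": [],
}

# Assumed semantic contribution factors per relation type.
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}


def s_values(term, parents=PARENTS, weights=WEIGHTS):
    """Map each ancestor of `term` (and `term` itself) to its contribution.

    The term contributes 1.0 to itself; an ancestor gets the largest
    product of edge weights over all paths from `term` up to it.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in parents.get(child, []):
            contribution = weights[relation] * s[child]
            if contribution > s.get(parent, 0.0):
                s[parent] = contribution
                stack.append(parent)  # re-propagate the improved value
    return s


def term_similarity(a, b, parents=PARENTS, weights=WEIGHTS):
    """Similarity of two terms from the contributions of shared ancestors."""
    sa, sb = s_values(a, parents, weights), s_values(b, parents, weights)
    shared = sa.keys() & sb.keys()
    numerator = sum(sa[t] + sb[t] for t in shared)
    return numerator / (sum(sa.values()) + sum(sb.values()))


def gene_similarity(terms1, terms2, parents=PARENTS, weights=WEIGHTS):
    """Best-match average over the GO terms annotating two genes (assumed)."""
    def best(term, others):
        return max(term_similarity(term, o, parents, weights) for o in others)

    total = sum(best(t, terms2) for t in terms1)
    total += sum(best(t, terms1) for t in terms2)
    return total / (len(terms1) + len(terms2))


if __name__ == "__main__":
    print(term_similarity("C", "A"))
    print(gene_similarity(["C"], ["A", "B"]))
```

In this sketch a term is always maximally similar to itself, and two terms that share only the root score lower than a term and its direct parent, which matches the intuition that similarity should grow with the amount of shared ancestry.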
Similar resources
GO-terms Semantic Similarity Measures
Four methods have been presented to determine the semantic similarity of two GO terms based on the annotation statistics of their common ancestor terms (Resnik [1], Jiang [2], Lin [3] and Schlicker [4]). Wang [5] proposed a new method to measure the similarity based on the graph structure of GO. Each of these methods has its own advantages and weaknesses. GOSemSim package [6] is developed to co...
CESSM: Collaborative Evaluation of Semantic Similarity Measures
The application of semantic similarity measures to proteins annotated with Gene Ontology terms has become a common method in bioinformatics. However, the evaluation of these measures is still challenging, since no common standard of evaluation exists. We present an online tool for the automated evaluation of GO-based semantic similarity measures, CESSM, that enables the comparison of new measur...
Semantic Particularity Measure for Functional Characterization of Gene Sets Using Gene Ontology
BACKGROUND Genetic and genomic data analyses are outputting large sets of genes. Functional comparison of these gene sets is a key part of the analysis, as it identifies their shared functions and the functions that distinguish each set. The Gene Ontology (GO) initiative provides a unified reference for analyzing genes' molecular functions, biological processes and cellular components. Nume...
simDEF: definition-based semantic similarity measure of gene ontology terms for functional similarity analysis of genes
MOTIVATION Measures of protein functional similarity are essential tools for function prediction, evaluation of protein-protein interactions (PPIs) and other applications. Several existing methods perform comparisons between proteins based on the semantic similarity of their GO terms; however, these measures are highly sensitive to modifications in the topological structure of GO, tend to be fo...
Journal: Bioinformatics
Volume 23, Issue 10
Publication date: 2007